Factoring Large Integers

By R. Sherman Lehman



Abstract. A modification of Fermat's difference of squares method is used for factoring
large integers. This modification permits factoring n in $O(n^{1/3})$ elementary operations, where
addition, subtraction, multiplication, division, or the extraction of a square root is
considered as an elementary operation. A principal part is played by the use of a dissection of the
continuum similar to the Farey dissection. This has been programmed for $n \le 1.05 \times 10^{20}$
on the CDC 6400.



1. Introduction. Fermat's method for factoring an odd positive integer n
consists of finding $n = x^2 - y^2$ where x and y are positive integers. We try in
succession

    $x = [n^{1/2}] + 1,\ x = [n^{1/2}] + 2,\ \ldots$

and determine whether the difference $x^2 - n$ is a square or not. If p and q are primes
and n = pq, then Fermat's method is quite efficient if p/q is near 1, but it requires 
a large number of trials if p/q is not near 1. Lawrence [2] used a method which is 
designed to be efficient if p/q is near a/b where a and b are small relatively prime 
integers. 
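
The search just described can be sketched in a few lines of Python. This is only an illustration: the name fermat_factor and the max_trials cutoff are assumptions of the sketch, and it starts at the smallest x with $x^2 \ge n$ so that perfect squares are also handled.

```python
from math import isqrt

def fermat_factor(n, max_trials=1_000_000):
    """Return (p, q, trials) with n = p * q, or None if the trial budget runs out."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be an odd positive integer")
    x = isqrt(n)
    if x * x < n:
        x += 1                      # smallest x with x*x >= n
    for trials in range(1, max_trials + 1):
        d = x * x - n               # candidate for y^2
        y = isqrt(d)
        if y * y == d:              # x^2 - n is a square: n = (x - y)(x + y)
            return x - y, x + y, trials
        x += 1
    return None
```

For 5959 = 59 · 101, where p/q is fairly close to 1, the third trial succeeds. For 30021 = 3 · 10007 the same loop needs several thousand trials, which is the weakness the rest of the paper addresses.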

We consider $x^2 - y^2 = 4kn$, $k = ab$ with $1 \le k \le r$. The idea we wish to use is
to divide up the interval [0, 1] into parts. Each part will correspond to a fraction
a/b, and these parts will fill the interval [0, 1]. This means that, for each r, we find
a sequence $S_r$ which includes a/b when $0 \le a \le b$, $b > 0$ and $ab \le r$. This is
reminiscent of the Farey sequence of order r. We prove in Section 3 that many of the
ideas go over to the new sequence $S_r$. In particular, one obtains a dissection of the
continuum similar to the Farey dissection of [0, 1].

The main theorem is given in Section 2. Its proof is contained in Section 4. Nu- 
merical results were obtained by a computation on the CDC 6400 of the Computer 
Center of the University of California at Berkeley. An Algol program is also given 
in Section 5. 

2. The Theorem. We shall use gcd(a, b) for the greatest common divisor of 
a and b. 

Theorem. Suppose that n is a positive odd integer and r is an integer such that
$1 \le r < n^{1/2}$. If n = pq where p and q are primes and

    $(n/(r + 1))^{1/2} < p \le n^{1/2}$,

then there are nonnegative integers x, y and k such that

         $x^2 - y^2 = 4kn$,  $1 \le k \le r$,
(2.1)    $x \equiv k + 1 \pmod 2$,
         $x \equiv k + n \pmod 4$ if k is odd,
         $0 \le x - (4kn)^{1/2} \le \frac{1}{4(r + 1)} (n/k)^{1/2}$

and

(2.2)    $p = \min(\gcd(x + y, n), \gcd(x - y, n))$.

If n is a prime, then there are no integers satisfying (2.1).

Let us see how many elementary operations are required to obtain the primes p
and q when n = pq. First, a constant times $(n/(r + 1))^{1/2}$ divisions are needed
to determine whether there is a small prime factor less than $(n/(r + 1))^{1/2}$. We find
that there are

    $O((n/r)^{1/2}) + \sum_{k=1}^{r} O((1/r)(n/k)^{1/2} + 1)$

elementary operations, where the extraction of a square root is counted as one
operation. Since $\sum_{k=1}^{r} k^{-1/2} = O(r^{1/2})$, we have

    $O((n/r)^{1/2}) + O((1/r)\, n^{1/2} r^{1/2}) + O(r)$

operations. Here, if we choose r to be a constant times $n^{1/3}$, we find $O(n^{1/3})$
elementary operations are required.
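
A minimal Python sketch of the two-stage search counted above may make the procedure concrete. It assumes, as the Theorem does, that n is a prime or a product of two primes; the name lehman_factor, the default choice of r near $n^{1/3}$, and the slightly generous integer upper bound on x are assumptions of this sketch, and it omits the congruence conditions of (2.1), which only reduce the number of x tried.

```python
from math import gcd, isqrt

def lehman_factor(n, r=None):
    """Return a prime factor of n, or n itself if no solution of (2.1) is found."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd integer >= 3")
    if r is None:
        r = max(1, round(n ** (1.0 / 3)))       # assumed choice: r close to n^(1/3)
    # Stage 1: trial division for factors not exceeding (n/(r+1))^(1/2).
    for d in range(3, isqrt(n // (r + 1)) + 1, 2):
        if n % d == 0:
            return d
    # Stage 2: look for x, y, k with x^2 - y^2 = 4kn and x in the window of (2.1).
    for k in range(1, r + 1):
        m = 4 * k * n
        x = isqrt(m)
        if x * x < m:
            x += 1                               # smallest x with x^2 >= 4kn
        # integer bound that is never below (4kn)^(1/2) + (n/k)^(1/2) / (4(r+1))
        x_max = isqrt(m) + (isqrt(n // k) + 1) // (4 * (r + 1)) + 1
        while x <= x_max:
            y2 = x * x - m
            y = isqrt(y2)
            if y * y == y2:
                p = min(gcd(x + y, n), gcd(x - y, n))   # as in (2.2)
                if 1 < p < n:
                    return p
            x += 1
    return n
```

For n = 5959 = 59 · 101 the sketch finds nothing by trial division and succeeds at k = 2 with x = 219, y = 17, which corresponds to approximating 59/101 by a/b = 1/2.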

3. The Sequence $S_r$. If r is a positive integer, then we denote by $S_r$ the sequence
of rational numbers a/b where $0 \le a \le b$, $b > 0$ and $ab \le r$ with a and b relatively
prime integers. We suppose that the sequence is arranged in order of increasing
size. For example, $S_{15}$ is the sequence

    0/1, 1/15, 1/14, 1/13, 1/12, 1/11, 1/10, 1/9, 1/8, 1/7, 1/6, 1/5, 1/4, 2/7, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, 1/1.

Lemma 1. If a/b and a'/b' are two successive terms of $S_r$, then

    $a'b - ab' = 1$ and $(a + a')(b + b') > r$.

Proof. It is well known that the Farey series of order n, which consists of all
reduced fractions between 0 and 1 whose denominators do not exceed n, can be
generated starting from 0/1, 1/1 by the following process: Between two successive
terms of the sequence generated, say a/b and a'/b', insert their mediant
$(a + a')/(b + b')$, which is always a reduced fraction, whenever b + b' does not
exceed n. A similar method can be used to generate $S_r$: we insert the mediant
$(a + a')/(b + b')$ whenever $(a + a')(b + b') \le r$. It follows that two successive
terms of $S_r$ are successive terms in a Farey series of some order and thus $a'b - ab' = 1$.
To avoid insertion of the mediant $(a + a')/(b + b')$ between them, we must have
$(a + a')(b + b') > r$. This completes the proof.
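
The mediant construction used in the proof can be tried out directly. The Python sketch below builds $S_r$ both by repeated mediant insertion and straight from the definition, so the two can be compared and Lemma 1 checked on small cases; the function names are assumptions of this sketch.

```python
from fractions import Fraction
from math import gcd

def s_sequence(r):
    """Build S_r as (a, b) pairs by mediant insertion, starting from 0/1, 1/1."""
    seq = [(0, 1), (1, 1)]
    changed = True
    while changed:
        changed = False
        out = [seq[0]]
        for (a, b), (a2, b2) in zip(seq, seq[1:]):
            if (a + a2) * (b + b2) <= r:         # insertion rule from the proof
                out.append((a + a2, b + b2))
                changed = True
            out.append((a2, b2))
        seq = out
    return seq

def s_sequence_direct(r):
    """Build S_r from the definition: 0 <= a <= b, ab <= r, gcd(a, b) = 1."""
    pairs = {(0, 1)}
    for b in range(1, r + 1):
        for a in range(1, b + 1):
            if a * b <= r and gcd(a, b) == 1:
                pairs.add((a, b))
    return sorted(pairs, key=lambda t: Fraction(t[0], t[1]))

def lemma1_holds(r):
    """Check a'b - ab' = 1 and (a + a')(b + b') > r for all successive terms."""
    seq = s_sequence(r)
    return all(a2 * b - a * b2 == 1 and (a + a2) * (b + b2) > r
               for (a, b), (a2, b2) in zip(seq, seq[1:]))
```

For r = 15 both constructions give the 21 terms listed above.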

We now use a dissection of the interval [0, 1] which is analogous to the Farey
dissection of the continuum (see [1, p. 29]). We take the sequence $S_r$ and form the
mediants between each two successive terms. We then cut up the interval [0, 1] into
pieces using the mediants as division points. Thus, we obtain a subinterval
corresponding to each term of $S_r$. It will be convenient to use closed subintervals.
Corresponding to 0/1, we have the subinterval $[0, 1/(r + 1)]$, and, corresponding to
1/1, we have the subinterval $[(a^* + 1)/(b^* + 1), 1]$ where $a^*/b^*$ is the term preceding
1/1 in $S_r$. If a'/b', a/b, and a''/b'' are three successive terms of $S_r$, then, corresponding
to a/b, we have the subinterval

    $\left[ \frac{a + a'}{b + b'},\ \frac{a + a''}{b + b''} \right]$.

By Lemma 1, we have

(3.1)    $\frac{a + a'}{b + b'} = \frac{a}{b} - \frac{1}{b(b + b')}$,   $\frac{a + a''}{b + b''} = \frac{a}{b} + \frac{1}{b(b + b'')}$.

We shall call this dissection with subintervals corresponding to $S_r$ the dissection of
order r.

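Lemma 2. If α is in the subinterval corresponding to a/b with a > 0 in the
dissection of order r, then

    $\frac{a}{b}\{1 - \delta(1 + \tfrac14\delta^2)^{1/2} + \tfrac12\delta^2\} \le \alpha \le \frac{a}{b}\{1 + \delta(1 + \tfrac14\delta^2)^{1/2} + \tfrac12\delta^2\}$

where $\delta = (ab(r + 1))^{-1/2}$.

Proof. Let a'/b' be the term preceding and a''/b'' the term following a/b in $S_r$,
and suppose that α is in the subinterval corresponding to a/b with $0 \le \alpha \le 1$. Since
the mediant $(a + a')/(b + b')$ is not in $S_r$, we have, by (3.1) and Lemma 1,

    $r + 1 \le (a + a')(b + b') = \frac{a + a'}{b + b'}(b + b')^2 = \frac{a}{b}(b + b')^2 - \frac{b + b'}{b}$.

Similarly, we have

    $r + 1 \le \frac{a}{b}(b + b'')^2 + \frac{b + b''}{b}$.

Using the first of these quadratic inequalities and that b + b' > 0, we obtain

    $b + b' \ge \{1 + (1 + 4ab(r + 1))^{1/2}\}/2a$

and

    $\frac{1}{b(b + b')} \le \frac{a}{b} \cdot \frac{2}{1 + (1 + 4ab(r + 1))^{1/2}} = \frac{a}{b}\{\delta(1 + \tfrac14\delta^2)^{1/2} - \tfrac12\delta^2\}$.

Hence, by (3.1), we obtain the first inequality of the lemma. Similarly, using the
second of these quadratic inequalities, we obtain

    $b + b'' \ge \{-1 + (1 + 4ab(r + 1))^{1/2}\}/2a$

and

    $\frac{1}{b(b + b'')} \le \frac{a}{b} \cdot \frac{2}{-1 + (1 + 4ab(r + 1))^{1/2}} = \frac{a}{b}\{\delta(1 + \tfrac14\delta^2)^{1/2} + \tfrac12\delta^2\}$.

From this, we obtain the second inequality of the lemma. This completes the proof.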
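
As a numerical illustration of the dissection of order r and of Lemma 2, the following Python sketch computes the closed subinterval belonging to each term of $S_r$ and compares its endpoints with the bounds of the lemma. The function names and the floating-point tolerance are assumptions of this sketch; the tolerance is needed because the bounds are attained exactly whenever $(a + a')(b + b') = r + 1$.

```python
from fractions import Fraction
from math import gcd, sqrt

def dissection(r):
    """Return [(term, left, right)]: the closed subinterval for each term of S_r."""
    terms = sorted({Fraction(a, b)
                    for b in range(1, r + 1)
                    for a in range(0, b + 1)
                    if a * b <= r and gcd(a, b) == 1})
    mediants = [Fraction(s.numerator + t.numerator, s.denominator + t.denominator)
                for s, t in zip(terms, terms[1:])]
    cuts = [Fraction(0)] + mediants + [Fraction(1)]   # division points of [0, 1]
    return [(t, cuts[i], cuts[i + 1]) for i, t in enumerate(terms)]

def lemma2_bounds(term, r):
    """Return the lower and upper bounds of Lemma 2 for a term a/b with a > 0."""
    a, b = term.numerator, term.denominator
    delta = (a * b * (r + 1)) ** -0.5
    root = delta * sqrt(1 + delta * delta / 4)
    ratio = a / b
    return (ratio * (1 - root + delta * delta / 2),
            ratio * (1 + root + delta * delta / 2))

def lemma2_holds(r, tol=1e-12):
    """Check Lemma 2 at both endpoints of every subinterval with a > 0."""
    for term, left, right in dissection(r):
        if term > 0:
            lo, hi = lemma2_bounds(term, r)
            if float(left) < lo - tol or float(right) > hi + tol:
                return False
    return True
```

For r = 15 the first subinterval is [0, 1/16] and the last is [4/5, 1], in agreement with the description given before (3.1).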
4. Proof of Theorem. Let n be an odd prime or let n = pq where p and q are
two odd primes with $p \le n^{1/2} \le q$. Consider the equation

(4.1)    $(x + y)(x - y) = x^2 - y^2 = 4kn$

where x and y are nonnegative integers and k is a positive integer. Then

(4.2)    $x + y = sa'n$, $x - y = tb'$   or   $x + y = tb'$, $x - y = sa'n$,

where s, t, a' and b' are positive integers and $st = 4$, $a'b' = k$;

(4.3)    $x + y = sa'q$, $x - y = tb'p$   or   $x + y = tb'p$, $x - y = sa'q$,

where s, t, a' and b' are positive integers and $st = 4$, $a'b' = k$.

To consider (4.2), we add the two equations and we get in either case $2x = sa'n + tb'$.
There are three possible cases: s = 4, t = 1; s = 1, t = 4; s = 2, t = 2. These
give

    $x = 2a'n + \tfrac12 b'$,   $x = \tfrac12 a'n + 2b'$,   $x = a'n + b'$.

In the first case, we see b' is even. Setting $a = 2a'$, $b = \tfrac12 b'$, we get

(4.4)    $x = an + b$ with $ab = k$.

In the second case, we see that a'n is even, and because n is odd, a' must be even.
Setting $a = \tfrac12 a'$, $b = 2b'$, we again get (4.4). In the last case, we obtain (4.4) with
a = a', b = b'.

We prove that if r is an integer such that $1 \le r < n^{1/2}$, then

    $x - (4kn)^{1/2} > \frac{1}{4(r + 1)}(n/k)^{1/2}$,   $1 \le k \le r$.

This contradicts one of the inequalities in (2.1). Actually, we prove the
stronger inequality

(4.5)    $x - (4kn)^{1/2} > \frac{1}{4(k + 1)}(n/k)^{1/2}$.

It is equivalent to

    $an + b > 2k^{1/2}n^{1/2} + \frac{1}{4(k + 1)}(n/k)^{1/2}$,   $1 \le k \le n - 2$,   $ab = k$,

by (4.4). Squaring both sides, we obtain

    $a^2n^2 - 2kn + b^2 > \frac{n}{k + 1} + \frac{n}{16(k + 1)^2 k}$.

We see that it can be reduced to the special case a = 1, b = k when the left side is
$n^2 - 2kn + k^2$. For, if $a \ge 2$ and $1 \le k \le n - 2$, then

    $(a^2 - 4)n^2 + (n^2 - k^2) + b^2 + 2n^2 \ge 0$

or

    $a^2n^2 - 2kn + b^2 \ge n^2 - 2kn + k^2$.

Thus, it is sufficient to consider $(n - k)^2 > \frac{n}{k + 1} + \frac{n}{16(k + 1)^2 k}$ where
$1 \le k \le n - 2$. A stronger inequality is

    $(n - k)^2 - (1 + \tfrac{1}{16})\frac{n}{k + 1} \ge 0$   or   $G(k, n) = (k + 1)(n - k)^2 - (1 + \tfrac{1}{16})n \ge 0$,

where $1 \le k \le n - 2$, $n \ge 3$ for all real values of k and n. Differentiating G(k, n)
with respect to k, we have

    $\partial G(k, n)/\partial k = (n - k)^2 - 2(k + 1)(n - k) = (n - k)(n - 2 - 3k)$.

For fixed n, we see that $\partial G/\partial k = 0$ at k = n and k = (n - 2)/3, and that G(k, n)
is increasing for $0 \le k < (n - 2)/3$ and decreasing for $(n - 2)/3 < k < n$. Thus,
it is sufficient to check it on the two rays k = 1, $n \ge 3$ and k = n - 2, $n \ge 3$. We
have

    $G(1, n) = 2(n - 1)^2 - (1 + \tfrac{1}{16})n = 2n^2 - 4n + 2 - (1 + \tfrac{1}{16})n \ge 6n - 4n - (1 + \tfrac{1}{16})n + 2 > 2$

and

    $G(n - 2, n) = 4(n - 1) - (1 + \tfrac{1}{16})n \ge 8 - 3(1 + \tfrac{1}{16}) > 4$.

It follows that it remains positive, and thus (4.5) is proved. We have shown that there
is no solution to (2.1) when n is a prime.

Now, we consider (4.3). Adding the two equations, we get $2x = sa'q + tb'p$.
There are three possible cases: s = 4, t = 1; s = 1, t = 4; s = 2, t = 2. These give

    $x = 2a'q + \tfrac12 b'p$,   $x = \tfrac12 a'q + 2b'p$,   $x = a'q + b'p$.

In the first case, we see that b'p is even and because p is odd, b' must be even. Setting
$a = 2a'$, $b = \tfrac12 b'$, we have

(4.6)    $x = aq + bp$,   $y = |aq - bp|$,   $k = ab$,

with a and b positive integers. In the second case, we see that a'q is even and because
q is odd, a' must be even. Setting $a = \tfrac12 a'$, $b = 2b'$, we get (4.6). In the last case, we
get (4.6) with a = a', b = b'.
Let d = gcd(a, b). Then

(4.7)    $x = aq + bp = d(a_1q + b_1p)$,   $y = |aq - bp| = d\,|a_1q - b_1p|$

where $a_1$ and $b_1$ are positive integers, and

    $(x/d)^2 - (y/d)^2 = 4a_1b_1n$,   $k = d^2a_1b_1$.

Thus, it can be reduced to the case in which a and b are relatively prime positive
integers.

If a and b are relatively prime, then we can prove that $x \equiv k + 1 \pmod 2$. If
k = ab is even, then one of the integers aq and bp is even and the other is odd since,
by assumption, p and q are odd while a and b are relatively prime. It follows that
x = aq + bp is odd. On the other hand, if k is odd, then the integers aq and bp are
odd. It follows that x is even.

We can prove that if k is odd, then $x \equiv k + n \pmod 4$. We consider p and q
which are odd and also a and b which are odd. Then p - a is even and q - b is even.
Hence their product is divisible by 4. Hence, $(p - a)(q - b) = pq + ab - aq - bp =
n + k - x$ is divisible by 4.
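
The identities behind (4.6) and the two congruences are easy to confirm numerically. The Python sketch below does so for all relatively prime pairs a, b up to a limit; the function names are assumptions of this sketch, and p, q are assumed to be odd primes as in the text.

```python
from math import gcd

def solution_from_pair(p, q, a, b):
    """Return (x, y, k) as in (4.6) for odd primes p, q and positive integers a, b."""
    x = a * q + b * p
    y = abs(a * q - b * p)
    return x, y, a * b

def check_congruences(p, q, limit):
    """Check x^2 - y^2 = 4kn and both congruences for all coprime a, b <= limit."""
    n = p * q
    for a in range(1, limit + 1):
        for b in range(1, limit + 1):
            if gcd(a, b) != 1:
                continue
            x, y, k = solution_from_pair(p, q, a, b)
            if x * x - y * y != 4 * k * n:
                return False
            if (x - k - 1) % 2 != 0:                  # x = k + 1 (mod 2)
                return False
            if k % 2 == 1 and (x - k - n) % 4 != 0:   # x = k + n (mod 4), k odd
                return False
    return True
```

With p = 59, q = 101, a = 1, b = 2 this gives x = 219, y = 17, k = 2, and indeed $219^2 - 17^2 = 47672 = 4 \cdot 2 \cdot 5959$.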

Since $ab = k < n^{1/2} \le q$, we have

    $p = \gcd(2bp, n) = \min(\gcd(x + y, n), \gcd(x - y, n))$

where any solution of (4.6) is used.
It remains to prove that

(4.8)    $0 \le x - (4kn)^{1/2} \le \frac{1}{4(r + 1)}(n/k)^{1/2}$

where

(4.9)    $p > (n/(r + 1))^{1/2}$.

Let $m = 4kn = 4abn$, and let $\tau = x - m^{1/2}$. Because the arithmetic mean is not
less than the geometric mean,

    $x = aq + bp \ge 2(abpq)^{1/2} = m^{1/2}$

and thus $\tau \ge 0$, which proves the left half of (4.8).
Letting $\epsilon = \tau m^{-1/2}$, we have

    $x = (1 + \epsilon)m^{1/2}$,   $y = (2\epsilon + \epsilon^2)^{1/2}m^{1/2}$.

The right half of the inequality (4.8) translates into

(4.10)    $\epsilon \le \tfrac18\delta^2$

where $\delta = (ab(r + 1))^{-1/2}$.

We now show that the point α = p/q lies in the subinterval corresponding to
a/b in the dissection of order r discussed in Section 3. In applying Lemma 2, we
must show that p/q does not lie in the interval $[0, 1/(r + 1)]$. This follows from

    $\frac{p}{q} = \frac{p}{n/p} = \frac{p^2}{n} > \frac{1}{r + 1}$

by (4.9). We obtain

    $\xi_1 a/b \le p/q \le \xi_2 a/b$

where

    $\xi_1 = 1 - \delta(1 + \tfrac14\delta^2)^{1/2} + \tfrac12\delta^2$,
    $\xi_2 = 1 + \delta(1 + \tfrac14\delta^2)^{1/2} + \tfrac12\delta^2$

are the two positive roots of the equation

(4.11)    $(1 - \xi)^2 = \delta^2\xi$.
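We consider separately two cases depending on whether $p/q \le a/b$ or $p/q > a/b$.
First, if $p/q \le a/b$ then $aq \ge bp$, and, by (4.6), we have

    $\frac{p}{q} = \frac{a(x - y)}{b(x + y)} = \frac{a}{b}\left(\frac{1 + \epsilon - (2\epsilon + \epsilon^2)^{1/2}}{1 + \epsilon + (2\epsilon + \epsilon^2)^{1/2}}\right) = \frac{a}{b}(1 + \epsilon - (2\epsilon + \epsilon^2)^{1/2})^2$.

Thus,

    $\xi_1 \le (1 + \epsilon - (2\epsilon + \epsilon^2)^{1/2})^2$   or   $\xi_1^{1/2} + (2\epsilon + \epsilon^2)^{1/2} \le 1 + \epsilon$.

Squaring both sides, we have $2\xi_1^{1/2}(2\epsilon + \epsilon^2)^{1/2} \le 1 - \xi_1$. Using that $\xi_1$ is a root
of (4.11), we find

(4.12)    $2\epsilon + \epsilon^2 \le \tfrac14\delta^2$.

Solving this inequality for ε, we obtain

    $\epsilon \le -1 + (1 + \tfrac14\delta^2)^{1/2} \le -1 + (1 + \tfrac18\delta^2) = \tfrac18\delta^2$,

which proves (4.10).

Second, if $p/q > a/b$, then $bp > aq$ and

    $\frac{p}{q} = \frac{a}{b}(1 + \epsilon + (2\epsilon + \epsilon^2)^{1/2})^2$.

Then $\xi_2 \ge (1 + \epsilon + (2\epsilon + \epsilon^2)^{1/2})^2$. From this, we obtain (4.10). This completes
the proof of the theorem.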

5. The Program and Results. The program was first written in Algol with- 
out use of any recursive procedures. It was planned that after testing the program, 
it would be transferred over to Fortran IV which has available double precision 
routines. This transfer was feasible because the computation preserves integers. 

A dissection of order r is given by a sequence $S_r$. Therefore, r must be chosen
appropriately. We chose $r = [0.1\, n^{1/3}]$ which is nearly the optimal value.
Consequently, we are looking for factors which are greater than $(n/(r + 1))^{1/2} \approx 10^{1/2} n^{1/3}$.
We obtained a Fortran routine which is valid for $n \le 1.05 \times 10^{20}$ and which requires
at most $1.4 \times 10^{-4}\, n^{1/3}$ seconds on the CDC 6400.

Professor Rene DeVogelaere furnished me with some integers of from 17 to 21 
digits which he wished to factor. In Table I, they are given with the results. We give, 
along with the factor, the resulting k where $x^2 - y^2 = 4kn$ and x and y are integers.
The time is given in seconds for the final version. 

In our discussion of the program, we give only the Algol procedures. The first
procedure is for finding $x \equiv a \pmod b$ where x is the least nonnegative residue of a
modulo b where a and b are positive integers. The second is for finding the gcd(a, b)
where a and b are positive integers. The third is a procedure isqrt(n, u) which gives
as its value the smallest positive integer j such that $j^2 \ge n$ and gives to u the
corresponding value of $j^2 - n$. This procedure starts from the real procedure sqrt(n), whose
result may be in error, and it is designed to correct this error.
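
The correction idea behind isqrt(n, u) can be sketched in Python as follows. This is an illustration rather than a transcription of the Algol procedure: the name isqrt_ceil is an assumption, the result is returned as a pair instead of through the parameter u, and the guard j > 1 is added so that the downward correction stops at a positive j.

```python
from math import sqrt

def isqrt_ceil(n):
    """Return (j, u): j is the least positive integer with j*j >= n, u = j*j - n."""
    j = int(sqrt(n)) + 1              # floating-point first guess, possibly off
    u = j * j - n
    while u < 0:                      # guess too small: (j+1)^2 - n = u + 2j + 1
        u += 2 * j + 1
        j += 1
    while j > 1 and u - 2 * j + 1 >= 0:   # (j-1)^2 - n >= 0, so j - 1 also works
        u -= 2 * j - 1
        j -= 1
    return j, u
```

Only integer additions are used after the first guess, so the result is exact even when the floating-point square root is slightly wrong for large n.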

We give the procedure factor(n, r, f). We enter the procedure by giving n and r
and leave it with f assigned a factor. If no factor has been found, then f is set
equal to 1.

In going through the integers k from 1 to r, there is an advantage in going through 
them in a prescribed order. Let d(k) be the number of positive divisors of k. If a/b is 
closest to the ratio of the divisors of n, which k = ab should we try first? As an 
example we take from Table I the first example 

    $k = 23220 = 2^2 \cdot 3^3 \cdot 5 \cdot 43$.
Table I

Number                  Factor        k                                                         Time in seconds
1123877887715932507     299155897     $23220 = 2^2 \cdot 3^3 \cdot 5 \cdot 43$                   2.6
1129367102454866881     25869889      $6750 = 2 \cdot 3^3 \cdot 5^3$                             1.3
29742315699406748437    372173423     $25982 = 2 \cdot 11 \cdot 1181$                            122.6
35249679931198483       59138501      $14554 = 2 \cdot 19 \cdot 383$                             17.8
208127655734009353      430470917     $21390 = 2 \cdot 3 \cdot 5 \cdot 23 \cdot 31$              1.9
331432537700013787      114098219     $14664 = 2^3 \cdot 3 \cdot 13 \cdot 47$                    6.0
3070282504055021789     1436222173    $100620 = 2^2 \cdot 3^2 \cdot 5 \cdot 13 \cdot 43$         7.2
3757550627260778911     16053127      $131229 = 3^2 \cdot 7 \cdot 2083$                          175.5
24928816998094684879    347912923     $82380 = 2^2 \cdot 3 \cdot 5 \cdot 1373$                   8.3
10188337563435517819    70901851      $18240 = 2^6 \cdot 3 \cdot 5 \cdot 19$                     3.0

Thus, there are $d(k) = 3 \cdot 4 \cdot 2 \cdot 2 = 48$ different representations a/b that we look at
simultaneously. Clearly, it is better to first choose k with d(k) large. For that reason,
we chose to look at multiples of

    $30 = 2 \cdot 3 \cdot 5$, $24 = 2^3 \cdot 3$, $12 = 2^2 \cdot 3$, $18 = 2 \cdot 3^2$, $6 = 2 \cdot 3$, 2, 1.

The program is designed to go through these sequences. 
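
The order in which k is visited can be illustrated with a short Python generator. The starting values and increment cycles below are read from the calls to the procedure large in the Algol listing, as far as that listing can be reconstructed; the name k_order and the table layout are assumptions of this sketch. Each pass skips the values of k already covered by an earlier pass, so every k from 1 to r is produced exactly once.

```python
def k_order(r):
    """Yield every k in 1..r exactly once, pass by pass."""
    passes = [
        (0,   [30]),                                # multiples of 30
        (-24, [48, 24, 24, 24]),                    # multiples of 24, not of 30
        (-12, [24, 24, 48, 24]),                    # remaining multiples of 12
        (-18, [36, 36, 72, 36]),                    # remaining multiples of 18
        (-6,  [12, 36, 24, 12, 24, 12, 24, 36]),    # remaining multiples of 6
        (-2,  [4, 2]),                              # remaining even numbers
        (-1,  [2]),                                 # odd numbers
    ]
    for start, steps in passes:
        k, s = start, 0
        while True:
            k += steps[s]                 # next increment in the cycle
            s = (s + 1) % len(steps)
            if k > r:
                break
            yield k
```

For r = 100 the generator begins 30, 60, 90, 24, 48, 72, 96, 12, 36, 84, and so on.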

We have set the Boolean array qr so that qr[i] is true if i is a quadratic residue
modulo $729 = 3^6$ and is false otherwise. We have picked 729 so that the proportion
that is true is only 274/729 = 0.38. For this proportion, we must do the additional
work of finding isqrt(u, t).
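
The effect of the table qr can be reproduced in Python. The sketch below builds the same kind of Boolean table and uses it as a cheap filter before an exact square test; the names residue_table, QR and is_perfect_square are assumptions of this sketch, and math.isqrt stands in for the Algol procedure isqrt.

```python
from math import isqrt

MODULUS = 729  # 3**6, the modulus chosen in the text

def residue_table(modulus=MODULUS):
    """table[i] is True exactly when i is a square modulo `modulus`."""
    table = [False] * modulus
    for i in range(modulus // 2 + 1):     # i and modulus - i have the same square
        table[(i * i) % modulus] = True
    return table

QR = residue_table()

def is_perfect_square(u):
    """Cheap residue test first; the exact square root only if it passes."""
    if not QR[u % MODULUS]:
        return False
    y = isqrt(u)
    return y * y == u
```

Counting the true entries of QR gives 274, the figure quoted above, so roughly 62 percent of the candidates are rejected without taking a square root.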

integer procedure mod(a, b); value a, b; integer a, b; mod := a - (a÷b)×b;

integer procedure gcd(a, b); value a, b; integer a, b;
begin integer i;
    if a < b then begin i := a; a := b; b := i end;
l:  i := mod(a, b); a := b; b := i;
    if i ≠ 0 then go to l; gcd := a
end gcd;

integer procedure isqrt(n, u); value n; integer n, u;
begin integer j, j1, j2;
    j := if n=0 then 1 else entier(sqrt(n))+1;
    j1 := j×j - n;
l1: if j1 < 0 then
    begin j1 := j1+2×j+1; j := j+1; go to l1 end;
l2: j2 := j1-2×j+1;
    if j2 ≥ 0 then
    begin j1 := j2; j := j-1; go to l2 end;
    isqrt := j; u := j1
end isqrt;
procedure factor(n, r, f); value n, r; integer n, r, f;
begin integer i, j, p;
    integer array c[1 : 8];
    Boolean array qr[0 : 728];

    procedure large(m, m0); value m, m0; integer m, m0;
    begin integer i, i1, j, jump, k, s, t, u, x, y; Boolean odd;
        s := 1; k := m0;
start:
        k := k+c[s]; s := if s=m then 1 else s+1;
        if k ≤ r then
        begin
            x := isqrt(4×k×n, u); j := (isqrt(n÷k, t) - 1)÷(4×(r+1));
            if mod(x+k, 2) = 0 then
            begin i1 := 1; u := u+2×x+1; x := x+1 end else i1 := 0;
            odd := mod(k, 2) = 1; jump := if odd then 4 else 2;
            if odd then
            begin
                comment For odd k we need x congruent to k+n modulo 4;
                if mod(k+n, 4) ≠ mod(x, 4) then
                begin i1 := i1+2; u := u+4×(x+1); x := x+2 end
            end;
            for i := i1 step jump until j+1 do
            begin
                if qr[mod(u, 729)] then
                begin
                    y := isqrt(u, t);
                    if t = 0 then
                    begin
                        p := gcd(n, x-y); if p > n÷p then p := n÷p;
                        go to exit
                    end;
                    comment When a factor p is found, we leave the
                        procedure by going to exit;
                end;
                if odd then begin u := u+8×(x+2); x := x+4 end
                else
                begin u := u+4×(x+1); x := x+2 end
            end;
            go to start
        end
    end large;

    for i := 0 step 1 until 728 do qr[i] := false;
    for i := 0 step 1 until 364 do
    begin j := mod(i×i, 729); qr[j] := true end;
    c[1] := 30; large(1, 0);
    c[1] := 48; c[2] := c[3] := c[4] := 24; large(4, -24);
    c[1] := c[2] := c[4] := 24; c[3] := 48; large(4, -12);
    c[1] := c[2] := c[4] := 36; c[3] := 72; large(4, -18);
    c[1] := c[4] := c[6] := 12; c[2] := c[8] := 36;
    c[3] := c[5] := c[7] := 24; large(8, -6);
    c[1] := 4; c[2] := 2; large(2, -2);
    c[1] := 2; large(1, -1);
    comment No factor has been found;
    p := 1;
exit: f := p
end factor;